NeuralSeek Setup

Set Up

Once you've opened NeuralSeek, you'll land on the "Configure" tab.
Enter the name of the company or organization that NeuralSeek will be generating answers for. Click "Next"

KnowledgeBase Connection

Input Watson Discovery KnowledgeBase details. Discovery API Key and Discovery Service URL can be found on the Watson Discovery service instance page. Once you have created a project and a collection in Watson Discovery, the Discovery Project ID can be found on the integration tab under API Information. Once you fill out the KnowledgeBase Connection details, test the connection by clicking "Test". Once tested, click "Next". The button will turn green when it successfully connects to Watson Discovery.
For Virtual Agent Type, select "Watson Assistant Type". Click "Next". Click "Next".

LLM Details

In the LLM Details, you must add at least one LLM. If you choose to add multiple, NeuralSeek will load-balance across them for the selected functions that have multiple LLM's. For API key, there are two options: SaaS and CP4D. You only need to fill out one or the other. Ensure your endpoint and project id are accurate. LLM Details

In this case, we are interested in CP4D. The CP4D api key must be a base64 encoded key, and it can be found using the terminal comamnd below. You can generate 'myapikey' by clicking on the profile picture on the top right hand corner and then clicking 'Generate new key'.

printf "myusername:myapikey" | base64

You can confirm if you succeeded in connecting with your zenapikey by using the curl command below:

curl -H "Authorization: ZenApiKey ${TOKEN}" "https://<cpd_instance_route>/<endpoint>"

Credentials from watsonx.ai that you need for LLM details: First make sure you have a project created within watsonx.ai.

LLM Endpoint
- Go to WatsonX Platform
- Prompt Lab -> View code (Right hand side, to the right of Model) -> Copy the API endpoint after curl
LLM Project ID:
- Go to WatsonX Platform
- Projects -> Manage -> General -> Details -> Project ID

Configuration & Tuning

Please see below for recommended settings:

Knowledgebase Connection:
- Curation Data Field: text
- Link Field: url
- Document Name Field: extracted_metadata.title
- Attribute sources inside LLM Conext by Document Name: Enabled
- Return the full document instead of passages (only enable this if all of your documents are short): Enabled
Knowledge Base Tuning:
- Document Score Range: 0.8
- Max Documents Per Seek: 1
- Document Data Penalty: 0
- Snippet Size: 1000
- KnowledgeBase Query Cache (minutes): 0
Prompt Engineering
- Weight Tuning Section
  - Temperature: 0
  - Top Probability: 0.7
  - Frequency Penalty: -0.3
  - Maximum Tokens: 0
Answer Engineering & Preferences
- For "How verbose should an answer be", select second tier using the slider.
Intent Matching & Cache Configuration
- Edited answer cache: 3
- Normal answer cache: 5
- Required Cache to Follow Context? Yes
- Required Cache to match the exact KB for the question and not the intent? No
Governance & Guardrails
- For the Semantic Score section, please turn on the following using the toggle:
  - Enable the Semantic Score Model
  - Use Semantic Score as the basis for Warning & Minimum confidence. Do NOT Enable for usecases requiring language translation.
  - Rerank the search results based on the Semantic Match
- Semantic Tuning
  - Missing key search term penalty: 0.4
  - Missing search term penalty: 0.25
  - Source jump penalty: 6
  - Total coverage weight: 0.25
  - Rerank min coverage %: 0.25
- Warning confidence
  - Confidence % for warning: 5
- Minimum Confidence
  - Minimum Confidence %: 0
  - Minimum Confidence % to display a URL: 5
  - Minimum Text: 0
  - Maximum Length: 100

Testing

Navigate to "Seek" tab. Test NeuralSeek with questions that are relevant to your documents, e.g. "What products or services do you offer?" You will be able to see the NeuralSeek answer with response details, metrics, and source. Ensure to clear session turn if starting a new session by clicking on the red reset icon:

In addition to testing on NeuralSeek, we have written a script to allow testing through API for more flexibility. We performed Pre-Processing and No OCR, No Pre-Processing and No OCR, and OCR experiments using the testing notebook. You can and run the different experiments just by changing the Discovery collection ID and providing with the questions and expected responses as string arrays. It uses the NeuralSeek API.

UI Testing

Within the NeuralSeek UI, there is also a feature to send a batch of questions to test at once. This can be helpful for users who prefer UI tools, and returns answers in a spreadhseet format. To use this feature, navigate to the "upload test questions" section on the home page: UI Testing